Prefix Probabilities from Stochastic Tree Adjoining Grammars

Authors

  • Mark-Jan Nederhof
  • Anoop Sarkar
  • Giorgio Satta
Abstract

Language models for speech recognition typically use a probability model of the form $\Pr(a_n \mid a_1, a_2, \ldots, a_{n-1})$. Stochastic grammars, on the other hand, are typically used to assign structure to utterances. A language model of the above form is constructed from such grammars by computing the prefix probability $\sum_{w \in \Sigma^*} \Pr(a_1 \cdots a_n w)$, where $w$ ranges over all possible terminations of the prefix $a_1 \cdots a_n$. The main result in this paper is an algorithm to compute such prefix probabilities given a stochastic Tree Adjoining Grammar (TAG). The algorithm achieves the required computation in $O(n^6)$ time. The probabilities of subderivations that do not derive any words in the prefix, but contribute structurally to its derivation, are precomputed to achieve termination. This algorithm enables existing corpus-based estimation techniques for stochastic TAGs to be used for language modelling.
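
The link between prefix probabilities and the conditional form used by speech recognizers can be illustrated with a minimal sketch. It is not the paper's algorithm: it assumes a hypothetical, finite toy distribution over complete sentences (`SENTENCES`) in place of a stochastic TAG, and it sums over terminations by brute-force enumeration, which is only possible because the set of sentences is finite. The function names are illustrative as well. The next-word probability is then the ratio of two prefix probabilities.

```python
from fractions import Fraction

# Hypothetical toy distribution over complete sentences (tuples of words).
# It stands in for the string distribution that a stochastic grammar defines;
# the sentences and their probabilities are made up for illustration.
SENTENCES = {
    ("the", "dog", "barks"): Fraction(1, 2),
    ("the", "dog", "sleeps"): Fraction(1, 4),
    ("the", "cat", "sleeps"): Fraction(1, 8),
    ("a", "cat", "sleeps"): Fraction(1, 8),
}


def prefix_probability(prefix):
    """Sum Pr(sentence) over all sentences that start with `prefix`."""
    prefix = tuple(prefix)
    return sum(
        (p for s, p in SENTENCES.items() if s[:len(prefix)] == prefix),
        Fraction(0),
    )


def next_word_probability(history, word):
    """Pr(word | history), computed as a ratio of two prefix probabilities."""
    denominator = prefix_probability(history)
    if denominator == 0:
        raise ValueError("history has probability zero under the toy model")
    return prefix_probability(tuple(history) + (word,)) / denominator


if __name__ == "__main__":
    print(prefix_probability(("the",)))            # prefix probability of "the"
    print(next_word_probability(("the",), "dog"))  # Pr(dog | the)
```

A real stochastic grammar generally has infinitely many terminations for a given prefix, so this enumeration does not carry over; that is the case the $O(n^6)$ algorithm described in the abstract addresses.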

Similar resources

Prefix Probabilities for Linear Indexed Grammars

We show how prefix probabilities can be computed for stochastic linear indexed grammars (SLIGs). Our results apply as well to stochastic tree-adjoining grammars (STAGs), due to their equivalence to SLIGs.

Prefix probabilities for linear indexed grammars

We show how prefix probabilities can be computed for stochastic linear indexed grammars (SLIGs). Our results apply as well to stochastic tree-adjoining grammars (STAGs), due to their equivalence to SLIGs.

Prefix Probabilities from Stochastic Tree Adjoining Grammars

Language models for speech recognition typically use a probability model of the form $\Pr(a_n \mid a_1, a_2, \ldots, a_{n-1})$. Stochastic grammars, on the other hand, are typically used to assign structure to utterances. A language model of the above form is constructed from such grammars by computing the prefix probability $\sum_{w \in \Sigma^*} \Pr(a_1 \cdots a_n w)$, where $w$ represents all possible terminations of the prefix a...

Stochastic Categorial Grammars

Statistical methods have turned out to be quite successful in natural language processing. During the recent years, several models of stochastic grammars have been proposed, including models based on lexicalised context-free grammars [3], tree adjoining grammars [15], or dependency grammars [2, 5]. In this exploratory paper, we propose a new model of stochastic grammar, whose originality derive...

Prefix Probabilities for Linear Context-Free Rewriting Systems

We present a novel method for the computation of prefix probabilities for linear context-free rewriting systems. Our approach streamlines previous procedures to compute prefix probabilities for context-free grammars, synchronous context-free grammars and tree adjoining grammars. In addition, the methodology is general enough to be used for a wider range of problems involving, for example, sever...

Publication date: 1998